On Boscovich's Estimator


Roger    Koenker 

and 

Gilbert  Bassett 


1.  Introduction

When Gauss discovered least-squares in the twilight of the 18th
century there were already several well-established proposals for
estimating bivariate linear models. Perhaps the best known of these
"precursors of least-squares" is the proposal of Roger Boscovich in
1757 to minimize the sum of absolute residuals subject to the constraint
that the mean residual is zero.

Boscovich's proposal attracted the attention of Thomas Simpson, a
leading English 18th century analyst, who provided a partial solution
to the problem of computing the Boscovich estimate. Subsequently, in
1799 Laplace completely characterized the solution of the bivariate
computational problem as a weighted median³, with weights $|x_i - \bar{x}|$,
of the pairwise slopes $s_i = (y_i - \bar{y})/(x_i - \bar{x})$, $i = 1, 2, \ldots, n$.
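
Laplace's characterization translates directly into a short computation. The sketch below is a minimal illustration under stated assumptions, not a reconstruction of Laplace's or Simpson's own procedure: the function names `weighted_median` and `boscovich_line` are hypothetical, and ties in the cumulative weights are resolved by taking the first slope whose cumulative weight strictly exceeds half the total.

```python
def weighted_median(values, weights):
    """Return s_(m) with m = min{ j : sum_{i<=j} |w_i| > sum_i |w_i| / 2 },
    where the values are first sorted in increasing order."""
    pairs = sorted(zip(values, (abs(w) for w in weights)))
    half = sum(w for _, w in pairs) / 2.0
    running = 0.0
    for value, w in pairs:
        running += w
        if running > half:
            return value
    raise ValueError("weights must not all be zero")


def boscovich_line(x, y):
    """Fit y = a + b*x by minimizing the sum of absolute residuals
    subject to the residuals having mean zero (bivariate case only)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    slopes, weights = [], []
    for xi, yi in zip(x, y):
        dx = xi - xbar
        if dx != 0.0:  # points at xbar carry zero weight and are skipped
            slopes.append((yi - ybar) / dx)
            weights.append(abs(dx))
    b = weighted_median(slopes, weights)
    a = ybar - b * xbar  # forces the fitted line through (xbar, ybar)
    return a, b
```

Because the intercept is set to $\bar{y} - b\bar{x}$, the residuals sum to zero for any slope, so the constraint holds by construction and only the slope requires the weighted median.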

After a long hiatus, Edgeworth (1887) revived the idea of the
Boscovich estimator calling it a "remarkable hybrid between the Method
of Least Squares and the Method of Situation," the latter being
Laplace's rather vague term for $\ell_1$ methods. In the next section we
develop an asymptotic theory of the Boscovich estimator for the general
linear model and compare its asymptotic behavior with that of some of
its better known, but less venerable competitors. The concluding section
suggests some possible applications of the theory to diagnostic
testing and prediction problems.

2.  Asymptotic Theory of the Boscovich Estimator

We  will  consider  the  classical  linear  model 

$$ y_i = \sum_{j=1}^{p} x_{ij}\beta_j + u_i = x_i\beta + u_i \tag{2.1} $$

where $u_i$, $i = 1, \ldots, n, \ldots$, are independent with common distribution
function $F(\cdot)$, satisfying $F^{-1}(1/2) = 0$, $Eu = \mu$, and having density $f$
which is continuous and strictly positive at 0 and $\mu$. The design will
be assumed to have an intercept, explicitly $x_{i1} = 1$ for all $i$, and to
satisfy the usual condition,

$$ \lim_{n \to \infty} n^{-1} X'X = D \tag{2.2} $$

for  a  positive  definite  matrix  D.   The  objective  function  of  the 
Boscovich  estimator  may  be  expressed  in  Lagrangian  form  as, 

$$ \sum_{i=1}^{n} \left[\, |y_i - x_i b| + \lambda (y_i - x_i b) \,\right]. \tag{2.3} $$

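Reparameterizing, set

$$ \delta_0 = \sqrt{n}\,(\lambda - \lambda_0), \qquad \delta_1 = \sqrt{n}\,(b - \beta - \mu e_1), $$

where $e_1 = (1, 0, \ldots, 0)' \in R^p$, and $\lambda_0 = 2F(\mu) - 1$. Then (2.3) becomes,

$$ R(\delta) = \sum_{i=1}^{n} \left[\, |u_i - x_i\delta_1/\sqrt{n} - \mu| + (\lambda_0 + \delta_0/\sqrt{n})(u_i - x_i\delta_1/\sqrt{n} - \mu) \,\right] \tag{2.4} $$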

which  we  study  employing  the  methods  of  Ruppert  and  Carroll  (1980)  and 
Jureckova  (1977).   The  gradient  of  R  is, 


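$$ g(\delta) = \nabla R(\delta) = \frac{1}{\sqrt{n}} \begin{pmatrix} \sum [\,u_i - x_i\delta_1/\sqrt{n} - \mu\,] \\ -\sum [\,\operatorname{sgn}(u_i - x_i\delta_1/\sqrt{n} - \mu) + \lambda_0 + \delta_0/\sqrt{n}\,]\, x_i' \end{pmatrix} $$

and,

$$ Eg(\delta) = \frac{1}{\sqrt{n}} \begin{pmatrix} -\sum x_i\delta_1/\sqrt{n} \\ -\sum [\,1 - 2F(x_i\delta_1/\sqrt{n} + \mu) + \lambda_0 + \delta_0/\sqrt{n}\,]\, x_i' \end{pmatrix}. $$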
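
The reparameterized objective and the expected gradient rest on two elementary steps, spelled out here as a worked check. It uses only the definitions of $\delta_0$, $\delta_1$ and $\lambda_0$ given in the text, together with the intercept convention $x_i e_1 = 1$; the symbol $c$ is introduced purely for illustration.

```latex
\begin{align*}
y_i - x_i b &= (x_i\beta + u_i) - x_i\,(\beta + \mu e_1 + \delta_1/\sqrt{n}) \\
            &= u_i - \mu\,(x_i e_1) - x_i\delta_1/\sqrt{n}
             = u_i - x_i\delta_1/\sqrt{n} - \mu, \\
\lambda     &= \lambda_0 + \delta_0/\sqrt{n}, \\
E\,\operatorname{sgn}(u_i - c) &= P(u_i > c) - P(u_i < c) = 1 - 2F(c),
  \qquad c = \mu + x_i\delta_1/\sqrt{n}, \\
E\,(u_i - x_i\delta_1/\sqrt{n} - \mu) &= -\,x_i\delta_1/\sqrt{n}.
\end{align*}
```

At $\delta = 0$ the bracket in the second component of the expected gradient is $1 - 2F(\mu) + \lambda_0 = 0$, which is why $\lambda_0 = 2F(\mu) - 1$ is the natural centering for the multiplier.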


It is easily shown under our conditions on $F$ that $Eg(\delta)$ has a unique
root at $\delta = 0$ which following Jureckova (1977) implies that $\hat\delta$ solving
(2.3) is $O_p(1)$ and hence $\hat\beta \to_P \beta + \mu e_1$ and $\hat\lambda \to_P \lambda_0$. Now expanding $F$
around $\delta = 0$ and setting $\omega = 2f(\mu)$, yields,


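$$ Eg(\delta) = \begin{pmatrix} 0 & -\bar{x} \\ -\bar{x}' & \omega D \end{pmatrix} \begin{pmatrix} \delta_0 \\ \delta_1 \end{pmatrix} + o_p(1). $$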
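
The linearization of the expected gradient can be checked term by term. The sketch below assumes that $\bar{x}$ denotes the limit of $n^{-1}\sum x_i$, written as a row vector; that reading is not stated explicitly in the text, but it is the one under which the first column of $D$ equals $\bar{x}'$ when $x_{i1} = 1$.

```latex
\begin{align*}
1 - 2F(\mu + x_i\delta_1/\sqrt{n}) + \lambda_0 + \delta_0/\sqrt{n}
  &= 1 - 2F(\mu) - 2f(\mu)\,x_i\delta_1/\sqrt{n} + (2F(\mu) - 1)
     + \delta_0/\sqrt{n} + o(n^{-1/2}) \\
  &= (\delta_0 - \omega\,x_i\delta_1)/\sqrt{n} + o(n^{-1/2}), \\
-\frac{1}{\sqrt{n}} \sum_i \frac{\delta_0 - \omega\,x_i\delta_1}{\sqrt{n}}\; x_i'
  &= -\Big(\tfrac{1}{n}\sum_i x_i'\Big)\delta_0
     + \omega\Big(\tfrac{1}{n}\sum_i x_i' x_i\Big)\delta_1
   \;\to\; -\bar{x}'\delta_0 + \omega D\,\delta_1, \\
-\frac{1}{\sqrt{n}} \sum_i x_i\delta_1/\sqrt{n}
  &= -\Big(\tfrac{1}{n}\sum_i x_i\Big)\delta_1 \;\to\; -\bar{x}\,\delta_1 .
\end{align*}
```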


And using the methods of Ruppert and Carroll (1980) we have for fixed
$M > 0$

$$ \sup_{\|\delta\| < M} \| g(\delta) - g(0) - Eg(\delta) + Eg(0) \| = o_p(1) $$

and since $g(\hat\delta) = o_p(1)$ and $Eg(0) = 0$ we have that

$$ \| Eg(\hat\delta_n) + g(0) \| = o_p(1). $$


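Now,

$$ V(g(0)) = V\left[ \frac{1}{\sqrt{n}} \begin{pmatrix} \sum (u_i - \mu) \\ -\sum [\,\operatorname{sgn}(u_i - \mu) + 2F(\mu) - 1\,]\, x_i' \end{pmatrix} \right] = \begin{pmatrix} \sigma^2 & G(\mu)\bar{x} \\ G(\mu)\bar{x}' & H(\mu) D \end{pmatrix} $$
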

where $G(\mu) = E|u - \mu|$, and $H(\mu) = 4(1 - F(\mu))F(\mu)$. Condition (2.2) and
the iid assumption on the errors imply that the summands of $g(\delta)$
satisfy the Lindeberg condition, and thus $\hat\delta$ converges in distribution
to a $p+1$ variate normal distribution with mean vector 0, and covariance
matrix


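$$ \begin{pmatrix} 0 & -\bar{x} \\ -\bar{x}' & \omega D \end{pmatrix}^{-1} \begin{pmatrix} \sigma^2 & G\bar{x} \\ G\bar{x}' & HD \end{pmatrix} \begin{pmatrix} 0 & -\bar{x} \\ -\bar{x}' & \omega D \end{pmatrix}^{-1} = \begin{pmatrix} H + 2\omega G + \omega^2\sigma^2 & (G + \omega\sigma^2)\, e_1' \\ (G + \omega\sigma^2)\, e_1 & \omega^{-2} H (D^{-1} - E_{11}) + \sigma^2 E_{11} \end{pmatrix} $$

where $E_{11}$ denotes a $p \times p$ matrix with 1 in the (1,1)-element and zeros
elsewhere.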
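
One way to verify the $\beta$-block of this product is to invert the partitioned matrix first. The sketch below relies on the identities $D e_1 = \bar{x}'$, $\bar{x} e_1 = 1$ and $e_1' D e_1 = 1$, which hold when the first regressor is the intercept and $\bar{x}$ is the limiting mean of the design rows; these identities are our reading of the setup rather than statements taken from the text.

```latex
\begin{align*}
\bar{x} D^{-1} &= e_1', \qquad \bar{x} D^{-1} \bar{x}' = 1, \\
\begin{pmatrix} 0 & -\bar{x} \\ -\bar{x}' & \omega D \end{pmatrix}^{-1}
  &= \begin{pmatrix} -\omega & -e_1' \\ -e_1 & \omega^{-1}(D^{-1} - E_{11}) \end{pmatrix}, \\
(D^{-1} - E_{11})\,\bar{x}' &= 0, \qquad
(D^{-1} - E_{11})\, D\, (D^{-1} - E_{11}) = D^{-1} - E_{11}.
\end{align*}
```

With these in hand, the lower-right block reduces to $\sigma^2 e_1 e_1' + \omega^{-2} H (D^{-1} - E_{11})$: the cross terms involving $G$ vanish because $(D^{-1} - E_{11})\bar{x}' = 0$, so the slope coefficients inherit the $\ell_1$-type scale while the intercept picks up $\sigma^2$.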

To interpret the result, consider first the symmetric case $\mu = 0$, so

$$ \omega = \omega_0 = 2f(0), \qquad H(\mu) = 4(1 - F(0))F(0) = 1, $$

and we have

$$ \sqrt{n}\,(\hat\beta - \beta) \to N\big(0,\ \omega_0^{-2}(D^{-1} - E_{11}) + \sigma^2 E_{11}\big). $$


Recall that the unconstrained $\ell_1$ estimator under those conditions
is asymptotically normal with covariance matrix $\omega_0^{-2} D^{-1}$. See Bassett
and Koenker (1978) for details. Thus, the asymptotic theory of the
Boscovich estimator, $\hat\beta$, in the symmetric case, is identical to that of
the usual $\ell_1$ estimator except that the asymptotic variance of the
intercept is $\sigma^2$, the variance of $F$, instead of $\omega_0^{-2}$, the asymptotic
variance of the normalized sample median from $F$. This seems to vindicate
Edgeworth's remark about the Boscovich estimator as a "remarkable
hybrid" between $\ell_1$ and $\ell_2$ methods.
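
To make the comparison concrete, the two scale factors $\omega_0^{-2} = (2f(0))^{-2}$ and $\sigma^2$ can be evaluated for familiar symmetric error laws. The snippet below is an illustrative sketch only: the helper name `intercept_scales` is hypothetical, and the two distributions (standard normal, and Laplace with density $e^{-|u|}/2$) are chosen here as examples and do not appear in the text.

```python
import math


def intercept_scales(f0, sigma2):
    """Return (omega_0^{-2}, sigma^2) given the density at the median, f0,
    and the error variance, sigma2, for a symmetric error distribution."""
    omega0 = 2.0 * f0
    return omega0 ** -2, sigma2


# Standard normal errors: f(0) = 1/sqrt(2*pi), variance 1.
normal = intercept_scales(1.0 / math.sqrt(2.0 * math.pi), 1.0)

# Laplace errors with density exp(-|u|)/2: f(0) = 1/2, variance 2.
laplace = intercept_scales(0.5, 2.0)

print("normal :", normal)
print("laplace:", laplace)
```

For normal errors the median scale $\pi/2$ exceeds $\sigma^2 = 1$, so replacing it by $\sigma^2$ in the intercept is a gain; for Laplace errors the ordering reverses ($1$ versus $2$), and the intercept is where the Boscovich estimator gives up precision.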

In asymmetric cases $\hat\beta \to \beta + \mu e_1$ so the regression surface is
shifted to the conditional expectation of $y$ rather than its conditional
median as for the unconstrained $\ell_1$ estimator. Secondly, the mean of
the Lagrange multiplier is non-zero in the asymmetric case; thus a test for
symmetry based on the Lagrange multiplier is possible. The covariance
matrix of $\sqrt{n}\,(\hat\beta - \beta - \mu e_1)$ is fundamentally the same as in the simple
$\ell_1$-case except that the scale parameter on the covariance matrix of the
slope parameters is $(2f(\mu))^{-2}\, 4(1 - F(\mu))F(\mu)$ instead of $(2f(0))^{-2}$.


3.   Applications 

There are two applications of the foregoing theory which we would
like to discuss briefly. The first is a test of symmetry of the error
distribution in linear models which might be used as a diagnostic for
$\ell_1$ regression. The second is an application to prediction problems.

Consider the null hypothesis, $H_0$: $\mu = 0$, which given the intercept
in the linear model is equivalent to $H_0'$: $\mu = F^{-1}(1/2)$ and represents
a salient necessary condition for symmetry of the errors. If we
consider local alternatives of the form

then a test of $H_0$ is available using the test statistic

$$ T = \sqrt{n}\,\hat\lambda / \sqrt{Q} \to_D N(\nu, 1) $$

where $Q = H + 2\omega G + \omega^2\sigma^2$ and the parameter, $\nu$, takes the form, $\eta/\sqrt{Q}$.

Unfortunately, while finding a consistent estimator of $Q$ is quite
easy (one could use residuals from the $\ell_1$-regression, or the empirical
distribution function proposed in Bassett and Koenker (1982)), a reasonable
estimate for small to modest size samples seems problematic.
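
Once an estimate of $Q$ is available, evaluating the statistic is mechanical. The sketch below is a hypothetical helper, not a procedure taken from the paper: the name `symmetry_test` and its arguments are assumptions, the estimate `q_hat` must be supplied by the user, and the p-value uses the limiting $N(0,1)$ null distribution of $T$, so the small-sample caveat above applies in full.

```python
import math


def symmetry_test(lambda_hat, n, q_hat):
    """Return (T, two-sided p-value) for T = sqrt(n) * lambda_hat / sqrt(q_hat),
    referring T to the standard normal distribution under the null."""
    if n <= 0 or q_hat <= 0.0:
        raise ValueError("n and q_hat must be positive")
    t = math.sqrt(n) * lambda_hat / math.sqrt(q_hat)
    # P(|Z| > |t|) for standard normal Z equals erfc(|t| / sqrt(2)).
    p_value = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p_value


t_stat, p_val = symmetry_test(lambda_hat=0.196, n=100, q_hat=1.0)
print(t_stat, p_val)
```

Keeping `q_hat` as an explicit input separates the mechanical part of the test from the harder problem of estimating $Q$ well in small samples.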

A second, and perhaps more promising application of the Boscovich
estimator is to prediction problems for linear models. A possible
objection to $\ell_1$ methods for prediction is their failure to predict the
conditional expectation of the response variable in asymmetric error
situations. While a reasonable argument might be made for conditional
median predictions, strict adherence to quadratic loss, for example,
dictates prediction of conditional expectations. Nevertheless, to protect
one's self against the consequences of heavy-tailed errors, one
might prefer an estimation method which achieved median precision for
the slope parameters, while sacrificing this precision for the intercept
to remove the median bias effect. This is, in effect, what the
Boscovich estimator achieves. It is easy to construct examples for
which it is preferred to both its $\ell_1$ and $\ell_2$ competitors.⁴

Finally, we might add that nothing we have done depends crucially
on the form of the Boscovich estimator and could easily be extended to
problems of the general form,

$$ \min_{b \in R^p} \sum \left[\, \rho(y_i - x_i b) - \lambda\, \psi(y_i - x_i b) \,\right] $$

for $\rho$ and $\psi$ corresponding to any plausible m-estimators.

Footnotes

1
We begin on a note of controversy. See Plackett (1972) and Stigler
(1981) for discussions of the least-squares priority debate between
Gauss and Legendre.

2
Stigler (1984) offers a fascinating glimpse of the Boscovich-Simpson
interchange, and describes an unpublished (1760) fragment in
which Simpson develops his approach to the Boscovich problem. See
Harter (1974) and Stigler (1973) for further background.

3
The term "weighted median" is apparently due to Edgeworth. Given
an ordered sample $s_1, \ldots, s_n$, and associated weights, $w_1, \ldots, w_n$, the
weighted median is simply $s_m$ such that
$m = \min\{\, j \mid \sum_{i=1}^{j} |w_i| > \sum_{i=1}^{n} |w_i|/2 \,\}$.

4
Take $D = I_2$, $x' = (1, 1)$ so $x'D^{-1}x = 2$. We need $F(\mu)(1 - F(\mu))/f(\mu)^2 < \sigma^2(F)$.
This is satisfied for the Pareto distribution with parameter
$a = 3$, for which $F(\mu) = 1 - \mu^{-a} = 19/27$, $f(\mu) = 3\mu^{-4} = 16/27$, $\mu = 3/2$,
$\sigma^2 = 3/4$.
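
The arithmetic in this example can be checked exactly with rational numbers. The sketch below assumes the unit-scale Pareto distribution on $[1, \infty)$ with $F(u) = 1 - u^{-a}$, whose mean is $a/(a-1)$ and variance is $a/((a-1)^2(a-2))$; the variable names are illustrative.

```python
from fractions import Fraction

a = 3
mu = Fraction(a, a - 1)                       # Pareto mean: 3/2
F_mu = 1 - mu ** (-a)                         # F(mu) = 1 - mu^(-a)
f_mu = a * mu ** (-(a + 1))                   # f(mu) = a * mu^(-(a+1))
sigma2 = Fraction(a, (a - 1) ** 2 * (a - 2))  # Pareto variance

# Scale of the Boscovich slope covariance: F(mu)(1 - F(mu)) / f(mu)^2.
slope_scale = F_mu * (1 - F_mu) / f_mu ** 2

print(F_mu, f_mu, sigma2, slope_scale, slope_scale < sigma2)
```

The slope scale works out to $19/32$, which is below $\sigma^2 = 3/4$, so the inequality required by the example holds.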